Bayesianism: Main Ideas
data \(X\) is fixed, and the parameters \(\theta\) of our process \(P_{\theta}\) are random
inference relies on the idea of updating prior beliefs based on evidence from the data
probabilities are used to quantify uncertainty we have about parameters
\[ \underbrace{p(\theta|d)}_\text{posterior} = \underbrace{\frac{p(d|\theta)}{p(d)}}_\text{update} \times \underbrace{p(\theta)}_\text{prior} \]
\[ P(A \mid B) = \frac{P(A\cap B)}{P(B)} = \frac{P(B \mid A) \cdot P(A)}{P(B)} \]
Bayes’ Rule is a way to calculate conditional probabilities
\[ P(\text{covid} \mid +) = \frac{P(+ \mid \text{covid}) \cdot P(\text{covid})}{P(+)} \]
What is the probability of having covid given that you got a positive covid test?
\[ P(\theta \mid \text{data}) = \frac{P(\text{data} \mid \theta) \cdot P(\theta)}{P(\text{data})} \]
“How did my theory change after seeing the data?”
\[ \color{#D55E00}{P(\theta \mid \text{data})} = \frac{P(\text{data} \mid \theta) \cdot P(\theta)}{P(\text{data})} \]
The Posterior is the probability of \(\theta\) after seeing the data.
\[ P(\theta \mid \text{data}) = \frac{P(\text{data} \mid \theta) \cdot \color{#CC79A7}{P(\theta)}}{P(\text{data})} \]
The Prior is the probability distribution of \(\theta\) before seeing the data.
\[ P(\theta \mid \text{data}) = \frac{\color{#F5C710}{P(\text{data} \mid \theta)} \cdot P(\theta)}{P(\text{data})} \]
The Likelihood is the probability of our data, given \(\theta\), for various different \(\theta\)s
\[ P(\theta \mid \text{data}) = \frac{P(\text{data} \mid \theta) \cdot P(\theta)}{\color{#009E73}{P(\text{data})}} \]
The Normalizing Constant is the probability of our data. It normalizes the posterior so that it’s a valid probability (distribution).
It does not matter.
The normalizing constant makes \(P(\theta \mid \text{data})\) a valid probability distribution (i.e. \(\int P(\theta \mid \text{data}) d \theta = 1\)) but, it’s just a scalar constant…so \(P(\text{data} \mid \theta) \cdot P(\theta) \propto P(\theta \mid \text{data})\) 👀
\[ \left[P(\text{data} \mid \theta) \cdot P(\theta) \right] \propto P(\theta \mid \text{data}) \]
\[ \text{likelihood} \cdot \text{prior} \propto \text{posterior} \]
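To see this proportionality in action, here is a minimal grid-approximation sketch. The numbers (3 successes in 10 trials, a Beta(2, 2) prior) are purely illustrative, not from the slides. Note that the prior’s normalizing constant \(B(\alpha,\beta)\) can be dropped entirely: any constant factor washes out when we normalize at the end.

```python
import math
import numpy as np

# Illustrative (assumed) data and prior: x = 3 successes in n = 10 trials, Beta(2, 2)
x, n = 3, 10
a, b = 2.0, 2.0

q = np.linspace(0.001, 0.999, 999)               # grid of candidate q values
likelihood = math.comb(n, x) * q**x * (1 - q)**(n - x)
prior = q**(a - 1) * (1 - q)**(b - 1)            # B(a, b) omitted: constants cancel
unnorm = likelihood * prior                      # proportional to the posterior

posterior = unnorm / unnorm.sum()                # dividing by the sum normalizes it
print(round(posterior.sum(), 6))                 # 1.0 -- a valid distribution on the grid
```

The grid posterior’s mean lands close to the exact conjugate answer \((x+a)/(n+a+b)\), even though we never computed \(P(\text{data})\) explicitly.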
we have a function \(f(\theta) = P(\text{data} \mid \theta) \cdot P(\theta)\) that is proportional to the probability distribution \(p(\theta) = P(\theta \mid \text{data})\) we want to sample from, but it is not itself a proper probability distribution…
❓ What does that remind you of?
Note: If we have draws from our posterior distribution \(p(\theta) = P(\theta \mid \text{data})\), we can use these draws to calculate any statistic we want: the mean, median, or quantiles of the draws, or of transformations of the draws.
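As a sketch of that note, here is what summarizing posterior draws looks like. The draws are simulated from an arbitrary Beta(2, 5) as a stand-in posterior; in practice they would come from your sampler.

```python
import numpy as np

rng = np.random.default_rng(42)
# Stand-in posterior draws (assumed Beta(2, 5) just for illustration)
draws = rng.beta(2, 5, size=10_000)

print(draws.mean())                        # posterior mean
print(np.median(draws))                    # posterior median
print(np.quantile(draws, [0.025, 0.975]))  # central 95% interval
# transformations of the draws work the same way, e.g. on the odds scale:
print(np.median(draws / (1 - draws)))
```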
\[ P(\text{flu} \mid \text{+}) = \frac{P(\text{+} \mid \text{flu}) \cdot P(\text{flu})}{{P(\text{+})}} \]
\(P(\text{flu}) = 0.05\) (prevalence of flu)
\(P(\text{+} \mid \text{flu}) = 0.99\) (sensitivity of test)
\(P(\text{+} \mid \text{no flu}) = 0.1\) (1- specificity of test)
\(P(\text{+}) = \underbrace{P(\text{+} \mid \text{flu})\cdot P(\text{flu})}_\text{way 1}+ \underbrace{P(\text{+} \mid \text{no flu})\cdot P(\text{no flu})}_\text{way 2}\)
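Plugging the numbers above into Bayes’ rule:

```python
p_flu = 0.05         # prior: prevalence of flu
p_pos_flu = 0.99     # sensitivity of test
p_pos_noflu = 0.10   # 1 - specificity of test

# total probability of a positive test ("way 1" + "way 2")
p_pos = p_pos_flu * p_flu + p_pos_noflu * (1 - p_flu)

p_flu_pos = p_pos_flu * p_flu / p_pos
print(round(p_pos, 4))      # 0.1445
print(round(p_flu_pos, 3))  # 0.343
```

Even with a 99%-sensitive test, a positive result gives only about a 34% chance of flu, because the prior \(P(\text{flu}) = 0.05\) is small.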
We’re interested in estimating \(q\) the proportion of days it rains in California. It rained 12 of the last 365 days.
Binomial Likelihood: \(\mathcal{L}(q \mid x) = \binom{n}{x} q^{x} (1 - q)^{n - x}\)
Beta Prior: \(q \sim \text{Beta}(\alpha, \beta)= \frac{q^{\alpha-1} (1-q)^{\beta-1}}{B(\alpha,\beta)}\)
Beta Prior Shiny App
Play around with the app for a minute, changing alpha and beta, until you find a prior that looks reasonable to you.
We’re interested in estimating \(q\) the proportion of days it rains in California. It rained 12 of the last 365 days (\(x\)).
Binomial Likelihood: \(\mathcal{L}(q \mid x) = \binom{365}{12} q^{12} (1 - q)^{365-12}\)
Beta Prior: \(q \sim \text{Beta}(1, 9) = \frac{q^{1-1} (1-q)^{9-1}}{B(1,9)}\)
Remember: \(P(q \mid x) \underbrace{\propto \binom{365}{12} q^{12} (1 - q)^{365-12}}_\text{likelihood} \times \underbrace{\frac{q^{1-1} (1-q)^{9-1}}{B(1,9)}}_\text{prior}\)
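The Beta prior is conjugate to the binomial likelihood, so this particular posterior has a closed form: \(\text{Beta}(\alpha + x,\ \beta + n - x)\). A sketch of working it out and sampling from it (the closed-form step is a standard result, not shown on the slides):

```python
import numpy as np

a, b = 1, 9        # Beta(1, 9) prior from the slides
n, x = 365, 12     # it rained 12 of the last 365 days

# Beta-Binomial conjugacy: posterior is Beta(a + x, b + n - x)
a_post, b_post = a + x, b + n - x   # Beta(13, 362)

rng = np.random.default_rng(0)
draws = rng.beta(a_post, b_post, size=10_000)
print(a_post, b_post)                      # 13 362
print(draws.mean())                        # close to 13 / 375 ≈ 0.0347
print(np.quantile(draws, [0.025, 0.975]))  # 95% credible interval for q
```

Once we have these draws, the earlier note applies: any statistic of \(q\) (mean, quantiles, transformations) is just a computation on the draws.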